Orphelia: predicting genes in metagenomic sequencing reads
نویسندگان
چکیده
Metagenomic sequencing projects yield numerous sequencing reads of a diverse range of uncultivated and mostly yet unknown microorganisms. In many cases, these sequencing reads cannot be assembled into longer contigs. Thus, gene prediction tools that were originally developed for whole-genome analysis are not suitable for processing metagenomes. Orphelia is a program for predicting genes in short DNA sequences that is available through a web server application (http://orphelia.gobics.de). Orphelia utilizes prediction models that were created with machine learning techniques on the basis of a wide range of annotated genomes. In contrast to other methods for metagenomic gene prediction, Orphelia has fragment length-specific prediction models for the two most popular sequencing techniques in metagenomics, chain termination sequencing and pyrosequencing. These models ensure highly specific gene predictions.
منابع مشابه
FragGeneScan: predicting genes in short and error-prone reads
The advances of next-generation sequencing technology have facilitated metagenomics research that attempts to determine directly the whole collection of genetic material within an environmental sample (i.e. the metagenome). Identification of genes directly from short reads has become an important yet challenging problem in annotating metagenomes, since the assembly of metagenomes is often not a...
متن کاملSPA: a short peptide assembler for metagenomic data
The metagenomic paradigm allows for an understanding of the metabolic and functional potential of microbes in a community via a study of their proteins. The substrate for protein identification is either the set of individual nucleotide reads generated from metagenomic samples or the set of contig sequences produced by assembling these reads. However, a read-based strategy using reads generated...
متن کاملTranscriptome analysis of the freshwater pearl mussel, Hyriopsis cumingii (Lea) using illumina paired-end sequencing to identify genes and markers
The transcriptome of triangle sail mussel Hyriopsis cumingii (Lea) using Illumina paired-end sequencing technology was conducted and analyzed. Equal quantities of total RNA isolated from six tissues, including gonad, hepatopancreas, foot, mantel, gill and adductor muscle, were pooled to construct a cDNA library. A total of 58.09 million clean reads with 98.48 % Q20 bases were generated. Cluster...
متن کاملComparative Analysis of Functional Metagenomic Annotation and the Mappability of Short Reads
To assess the functional capacities of microbial communities, including those inhabiting the human body, shotgun metagenomic reads are often aligned to a database of known genes. Such homology-based annotation practices critically rely on the assumption that short reads can map to orthologous genes of similar function. This assumption, however, and the various factors that impact short read ann...
متن کاملTranscriptome analysis of the freshwater pearl mussel, Hyriopsis cumingii (Lea) using illumina paired-end sequencing to identify genes and markers
The transcriptome of triangle sail mussel Hyriopsis cumingii (Lea) using Illumina paired-end sequencing technology was conducted and analyzed. Equal quantities of total RNA isolated from six tissues, including gonad, hepatopancreas, foot, mantel, gill and adductor muscle, were pooled to construct a cDNA library. A total of 58.09 million clean reads with 98.48 % Q20 bases were generated. Cluster...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره 37 شماره
صفحات -
تاریخ انتشار 2009